Bialek, Callan and Strong have recently given a solution of the problem of determining a continuous probability distribution from a finite set of experimental measurements by formulating it as a one-dimensional quantum field theory. This letter gives a reparametrization-invariant solution of the problem, obtained by coupling to gravity. The case of a large number of dimensions may involve quantum gravity restricted to metrics of vanishing Weyl curvature.

In many instances in physics, or in other fields, it is necessary to determine the probability distribution that underlies 
a finite set of experimental results involving continuous variables. From the practical point of view, one usually has a 
definite finite-parameter model for the probability distribution on theoretical grounds, and the parameters are fixed 
by fitting to the data set. This introduces a theoretical bias, so it is of interest to attempt a direct determination 
of the probability distribution without resorting to finite-dimensional approximations. Of course, for any finite data 
set, one obtains only a probabilistic description of the probability distribution, but the spread of possible probability 
distributions decreases as the size of the data set is increased [1].

Bialek, Callan and Strong [2] have recently given an elegant formulation of this problem in one dimension. They used Bayes' rule to write the probability of the probability distribution $Q$, given the data $\{x_i\}$, as

$$P[Q \mid x_1, \ldots, x_N] = \frac{Q(x_1) \cdots Q(x_N)\, P[Q]}{\int [dQ]\, Q(x_1) \cdots Q(x_N)\, P[Q]}, \tag{1}$$

where the factors $Q(x_i)$ arise because each $x_i$ is chosen independently from the distribution $Q(x)$, and $P[Q]$ encodes the a priori hypotheses about the form of $Q$. The optimal least-square estimate of $Q$, $Q_{\rm est}(x, \{x_i\})$, is then

$$Q_{\rm est}(x, \{x_i\}) = \frac{\langle Q(x)\, Q(x_1) \cdots Q(x_N) \rangle^{(0)}}{\langle Q(x_1) \cdots Q(x_N) \rangle^{(0)}}, \tag{2}$$

where $\langle \cdots \rangle^{(0)}$ denotes an expectation value with respect to the a priori distribution $P[Q(x)]$.
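
To make the structure of this estimate concrete, the following is a minimal sketch in which the functional integral over $Q$ is replaced by a sum over a small, finite set of candidate densities. This replacement is purely an illustrative assumption (it is exactly the kind of finite-dimensional model the field-theoretic treatment avoids); the Gaussian candidates, the uniform prior, the data values and the names `gaussian` and `estimate` are all hypothetical.

```python
import math

def gaussian(x, mean, width=1.0):
    """Normalized Gaussian density; an illustrative candidate for Q only."""
    z = (x - mean) / width
    return math.exp(-0.5 * z * z) / (width * math.sqrt(2.0 * math.pi))

def estimate(x, data, candidates, prior):
    """Ratio <Q(x) prod_i Q(x_i)> / <prod_i Q(x_i)> over a finite candidate set."""
    # Prior weight times the likelihood of the data for each candidate Q.
    weights = [p * math.prod(q(xi) for xi in data)
               for q, p in zip(candidates, prior)]
    evidence = sum(weights)
    # Posterior-weighted average of the candidate densities at the point x.
    return sum(w * q(x) for w, q in zip(weights, candidates)) / evidence

# Hypothetical setup: three candidate densities with a uniform prior.
candidates = [lambda x, m=m: gaussian(x, m) for m in (-2.0, 0.0, 2.0)]
prior = [1.0 / 3.0] * 3
data = [1.8, 2.3, 1.9]

for x in (-2.0, 0.0, 2.0):
    print(x, estimate(x, data, candidates, prior))
```

In this toy setting the estimate is a mixture of the candidates weighted by their posterior probabilities, so it is largest near the candidate that best explains the data.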

In this field-theoretic setting, Ref. [2] assumed that the prior distribution $P[Q]$ should penalize large gradients, so, written in terms of an unconstrained variable $\phi = -\ln(\ell Q) \in (-\infty, +\infty)$, they assumed

with $\ell$ a parameter that they later averaged over independently. This form for the prior distribution is very simple, and quite minimal in terms of underlying assumptions, so conclusions drawn from it should be robust. There is, however, one important respect in which this prior distribution does not measure up, so at this juncture my analysis deviates from Ref. [2].

The parameter $\ell$ is a global variable, independent of $x$, in the analysis of Ref. [2], whereas there are some reasons for wishing to keep this scale parameter local, as I now discuss. We expect that the parametrization of the data $\{x_i\}$ should not affect the probability distribution that we determine from it. To be more precise, we expect that if $f(x)$ is a monotone function of $x$, then $\{f(x_i)\}$ should determine a probability distribution $Q_f$ which is related to $Q_x$ by

$$\int_{x_-}^{x_+} dx\, Q_x(x) = \int_{f(x_-)}^{f(x_+)} dy\, Q_f(y) \tag{4}$$

for arbitrary values of $x_\pm$; in words, the estimated distributions must be covariant with respect to reparametrizations of the data. The assumed form of $P[Q]$ in Ref. [2] does not possess this reparametrization-covariance, so my aim here
is to change their formulation to find a solution that does. The reparametrization invariance of results derived from 
the data is, obviously, high on the list of desiderata for this problem. 
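
The covariance requirement can be checked numerically for a known density. The sketch below is only an illustration under stated assumptions: the unit exponential density, the map $f(x) = \ln x$, the interval endpoints and the helper `integrate` are hypothetical choices, not taken from the analysis. It verifies that the probability assigned to an interval in $x$ equals the probability assigned to its image under $f$ once the density is transformed with the Jacobian.

```python
import math

def integrate(g, a, b, n=20000):
    """Composite trapezoid rule for g on [a, b]."""
    step = (b - a) / n
    total = 0.5 * (g(a) + g(b))
    for k in range(1, n):
        total += g(a + k * step)
    return total * step

def q_x(x):
    """Illustrative density on x > 0: unit exponential (an assumption)."""
    return math.exp(-x)

def f(x):
    """Illustrative monotone reparametrization (an assumption)."""
    return math.log(x)

def f_inverse(y):
    return math.exp(y)

def q_f(y):
    """Density of f(x): Q_x(f^{-1}(y)) times the Jacobian d f^{-1}/dy = e^y."""
    return q_x(f_inverse(y)) * math.exp(y)

x_minus, x_plus = 0.5, 3.0
mass_x = integrate(q_x, x_minus, x_plus)
mass_f = integrate(q_f, f(x_minus), f(x_plus))
print(mass_x, mass_f)  # the two probabilities should agree
```

An estimator is reparametrization-covariant in the sense used here if the densities it produces from $\{x_i\}$ and from $\{f(x_i)\}$ pass this same check for every interval.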

In fact, the resolution of this problem is implicit in Ref. [2], if one takes care to read their discussion regarding the determination of $\ell$, as well as their discussion regarding the error in determining the correct probability distribution from a finite data set. As observed there, the introduction of $\ell$ is really a determination of a scale on which variations of $Q$ are viewed as too rapid. We should not expect that this length scale is a global length scale; indeed, quite intuitively, one would like to allow derivatives of different magnitudes in $Q$ in regions where data points cluster and in regions where they are sparse.

From a particle theory perspective, these two problems, that of reparametrization invariance, and that of a local
scale determination, signal the need to introduce a metric, in other words, to couple the field $\phi$ to gravity. I shall
show in the following that this coupling to 'quantum' gravity makes the analysis easy, leading to a final result that 
could not take a simpler form. 

I write $Q(x) = \sqrt{h(x)}\, \exp(-\phi(x))/\ell$, with

since in one dimension the metric, $h(x)$, is a section of a real line bundle, hence in any fixed coordinate system, is just a 'function'. Note that $\ell$ can be absorbed in $\sqrt{h(x)}$, so we set $\ell = 1$ in the following. The inverse of the metric is just $1/h(x)$ and the reparametrization-invariant volume element is $\sqrt{h(x)}\, dx$, so $P[\phi, h]$ is reparametrization invariant. Now, we want to evaluate

where

Notice that the integral over all metrics has been divided by the volume of the group of orientation-preserving diffeomorphisms, as is appropriate for a reparametrization-invariant action. In one dimension, this division eliminates all but one degree of freedom from the metric. Further, there is no operational way to distinguish between the factors $\sqrt{h(x_i)}$ and $\exp(-\phi(x_i))$; in other words, these must occur together in $Q(x_i)$.

The equations of motion that follow from varying $S$ are

where I have used primes to denote $d/dx$. Introduce a variable

$$y(x) = \int^{x} \sqrt{h(s)}\, ds, \tag{12}$$

then it follows from the equations of motion above that

It is now necessary to be careful about the limits of integration. Similar care was needed in the analysis of Ref. [2], outside the region where the data $\{x_i\}$ lie. It is interesting to note that this care in the boundary terms is unnecessary at $N = \infty$, in accord with the fact that any finite data set will clearly not indicate the true limits of the probability
distribution. Suppose that x ranges from x_ to x + , then integrating eq. [l^ I find 

N — iX— ( + efe(— 1=0')' = , (14) 

so since 
it follows that



$$\exp(+\phi) = \frac{(y_+ - y_-)\,(y - c)^2}{(y_+ - c)\,(y_- - c)}, \tag{16}$$

where $c$ is an arbitrary constant of integration, restricted only by $c > y_+ > y_-$ or $y_+ > y_- > c$, since $\exp(-\phi)$ must be positive. This 'cyclic' constraint on $y_+$, $y_-$, $c$ is a hint of projective invariance, expected in this theory because the algebra of infinitesimal reparametrizations of the line has a subalgebra isomorphic to $sl(2,\mathbb{R})$, familiar territory for string theorists.

Finally, we need to determine $y(x)$, which satisfies

Observe that the cross ratio on the left is projectively invariant, i.e., invariant under transformations of the form

$$y \to \frac{a y + b}{c y + d}, \tag{18}$$

with $a$, $b$, $c$, $d$ real. This amounts to a three-parameter family of equivalent solutions $y(x)$ determined by the data. This projective invariance can be fixed by setting $c = \infty$, $y_+ = 1$ and $y_- = 0$. In this case, $\phi = 0$!
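
The projective invariance of a cross ratio is easy to confirm numerically. In the sketch below, the particular arrangement of the four points $y$, $y_-$, $y_+$, $c$ in `cross_ratio` is an assumption made for illustration, as are the function names and the numerical values; any cross ratio of four points is invariant in the same way. The last lines illustrate the gauge choice $c = \infty$, $y_+ = 1$, $y_- = 0$, in which this cross ratio reduces to $y$ itself.

```python
def cross_ratio(y, y_minus, y_plus, c):
    """One cross ratio of the four points (this arrangement is an assumption)."""
    return ((y - y_minus) * (y_plus - c)) / ((y - c) * (y_plus - y_minus))

def mobius(y, coeffs):
    """Projective map y -> (a*y + b) / (g*y + d), with coeffs = (a, b, g, d)."""
    a, b, g, d = coeffs
    return (a * y + b) / (g * y + d)

# Hypothetical points with c outside the interval [y_minus, y_plus].
points = (0.4, 0.0, 1.0, 5.0)   # y, y_minus, y_plus, c
coeffs = (2.0, 1.0, 0.5, 3.0)   # a*d - b*g = 5.5, so the map is invertible

before = cross_ratio(*points)
after = cross_ratio(*(mobius(p, coeffs) for p in points))
print(before, after)  # equal up to rounding

# Gauge choice: y_minus = 0, y_plus = 1 and c very large stands in for c = infinity.
gauge_fixed = cross_ratio(0.4, 0.0, 1.0, 1.0e12)
print(gauge_fixed)  # close to y = 0.4
```

The three real parameters of the map (four coefficients up to an overall scale) are exactly what is used up in fixing the three values $c$, $y_+$ and $y_-$.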

This is quite remarkable. What we have just found is that the saddle-point probability distribution deduced from the data $\{x_i\}$ is given by a function $Q(x) = y'(x)$, where $y$ is that reparametrization of the interval $(x_-, x_+)$ such that $\{y(x_i)\}$ are uniformly distributed between 0 and 1, in other words, $\phi(y) = 0$. This solution is manifestly reparametrization invariant, since the solution has picked a canonical set of coordinates. Given any reparametrization of the data, the same canonical coordinates will be found, hence the probability distributions deduced in the two sets of coordinates will automatically be reparametrization-covariant, as desired.
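
The statement that the same canonical coordinates are found after any reparametrization can be illustrated with a small sketch. It makes a simplifying assumption: the canonical coordinate of each data point is taken to be its midpoint rank, $(k + 1/2)/N$ for the point with $k$ smaller neighbours, which spreads the points uniformly over $(0, 1)$. This is a stand-in for the saddle-point solution, not a derivation of it, and the data values and the function name `canonical_coordinates` are hypothetical.

```python
import math

def canonical_coordinates(data):
    """Assign each point its midpoint rank (k + 1/2) / N, uniform on (0, 1)."""
    n = len(data)
    order = sorted(range(n), key=lambda i: data[i])
    y = [0.0] * n
    for rank, i in enumerate(order):
        y[i] = (rank + 0.5) / n
    return y

# Hypothetical data and an increasing reparametrization f(x) = exp(x).
data = [0.3, 1.7, 0.9, 2.4, 0.1]
reparametrized = [math.exp(x) for x in data]

y_original = canonical_coordinates(data)
y_reparametrized = canonical_coordinates(reparametrized)
print(y_original)
print(y_reparametrized)  # identical: ranks are unchanged by increasing maps
```

Because an increasing map preserves the ordering of the data, the coordinates $y(x_i)$ are unchanged, and a density defined as $dy/dx$ then picks up exactly the Jacobian factor required for covariance.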

I omit the remaining steps in the semi-classical analysis since they are clearly explained in Ref. [2]. The slight
complication here is in explicitly treating the division by the volume of the diffeomorphism group, but this is easily 
handled in one dimension by the Faddeev-Popov method. 

Of considerable interest is the generalization of the reparametrization-invariant theory given above to the case of dimensions much larger than one [3]. It seems clear that we should not be integrating over all possible metrics, since there are entirely too many degrees of freedom in such an integration, in contrast to integrating over metrics in one, two, or three dimensions. A suitable constraint might be to integrate only over metrics with vanishing Weyl curvature, which are metrics such that there exist coordinates in which the metric is proportional to a constant matrix. In fact, counting local degrees of freedom, it would appear that it is unnecessary even to introduce a function $\phi$ into the analysis for dimensions larger than three; there are exactly enough degrees of freedom in metrics with vanishing Weyl curvature to leave a local scalar degree of freedom after dividing by the diffeomorphism group. This is the same count that was used in one dimension, so it suggests that the correct generalization to higher dimensions is given by

$$P[h] = \frac{1}{Z} \exp\left( - \int d^D x\, \sqrt{h(x)}\, R(h_{\alpha\beta}(x)) \right) \delta\left( 1 - \int d^D x\, \sqrt{h(x)} \right), \tag{19}$$

where $R(h_{\alpha\beta})$ is the Ricci scalar constructed out of the metric $h_{\alpha\beta}$, constrained to be a metric of vanishing Weyl curvature, and $\sqrt{h} = \det^{1/2}(h_{\alpha\beta})$ as usual. However, the analysis of the equations that follow from this ansatz is more complicated because the constraint of vanishing Weyl curvature contributes a source term in the equation analogous to eq. (9) above. Further, while the weak-field limit of $R$ seems to give the correct damping of gradients, it is unclear whether there is a positivity theorem that could be proved for $R(h_{\alpha\beta})$, subject to the constraint of vanishing Weyl curvature.
This is under investigation. 